AgentDB Learning Plugins

Installation

npx skills add https://github.com/proffesor-for-testing/agentic-qe --skill 'AgentDB Learning Plugins'
What This Skill Does
Provides access to 9 reinforcement learning algorithms via AgentDB's plugin system. Create, train, and deploy learning plugins for autonomous agents that improve through experience. Includes offline RL (Decision Transformer), value-based learning (Q-Learning), policy gradients (Actor-Critic), and advanced techniques.
Performance
Train models 10-100x faster with WASM-accelerated neural inference.

Prerequisites
- Node.js 18+
- AgentDB v1.0.7+ (via agentic-flow)
- Basic understanding of reinforcement learning (recommended)

Quick Start with CLI
Create Learning Plugin

Interactive wizard

npx agentdb@latest create-plugin

Use specific template

npx agentdb@latest create-plugin -t decision-transformer -n my-agent

Preview without creating

npx agentdb@latest create-plugin -t q-learning --dry-run

Custom output directory

npx agentdb@latest create-plugin -t actor-critic -o ./plugins

List Available Templates

Show all plugin templates

npx agentdb@latest list-templates

Available templates:

- decision-transformer (sequence modeling RL - recommended)

- q-learning (value-based learning)

- sarsa (on-policy TD learning)

- actor-critic (policy gradient with baseline)

- curiosity-driven (exploration-based)

Manage Plugins

List installed plugins

npx agentdb@latest list-plugins

Get plugin information

npx agentdb@latest plugin-info my-agent

Shows: algorithm, configuration, training status

Quick Start with API
import { createAgentDBAdapter } from 'agentic-flow/reasoningbank';

// Initialize with learning enabled
const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/learning.db',
  enableLearning: true, // Enable learning plugins
  enableReasoning: true,
  cacheSize: 1000,
});

// Store training experience
// (computeEmbedding is not defined in this snippet; supply your own embedding function)
await adapter.insertPattern({
  id: '',
  type: 'experience',
  domain: 'game-playing',
  pattern_data: JSON.stringify({
    embedding: await computeEmbedding('state-action-reward'),
    pattern: {
      state: [0.1, 0.2, 0.3],
      action: 2,
      reward: 1.0,
      next_state: [0.15, 0.25, 0.35],
      done: false
    }
  }),
  confidence: 0.9,
  usage_count: 1,
  success_count: 1,
  created_at: Date.now(),
  last_used: Date.now(),
});

// Train learning model
const metrics = await adapter.train({
  epochs: 50,
  batchSize: 32,
});

console.log('Training Loss:', metrics.loss);
console.log('Duration:', metrics.duration, 'ms');

Available Learning Algorithms (9 Total)
1. Decision Transformer (Recommended)
Type: Offline Reinforcement Learning
Best For: Learning from logged experiences, imitation learning
Strengths: No online interaction needed, stable training

npx agentdb@latest create-plugin -t decision-transformer -n dt-agent

Use Cases:
- Learn from historical data
- Imitation learning from expert demonstrations
- Safe learning without environment interaction
- Sequence modeling tasks

Configuration:
{
  "algorithm": "decision-transformer",
  "model_size": "base",
  "context_length": 20,
  "embed_dim": 128,
  "n_heads": 8,
  "n_layers": 6
}
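
A Decision Transformer conditions each prediction on the return still to be collected (the "return-to-go") together with recent states and actions. The sketch below shows how logged episode steps could be turned into such a context window. It is illustrative only and is not AgentDB's internal implementation; `returnsToGo`, `buildContext`, and the step shape `{ state, action, reward }` are assumptions made for the example, and the default window of 20 mirrors `context_length` above.

```javascript
// Illustrative sketch, not AgentDB's implementation.

// Return-to-go at step t: rewards[t] + rewards[t+1] + ... + rewards[T-1]
function returnsToGo(rewards) {
  const rtg = new Array(rewards.length);
  let running = 0;
  for (let t = rewards.length - 1; t >= 0; t--) {
    running += rewards[t];
    rtg[t] = running;
  }
  return rtg;
}

// Build the last `contextLength` (return-to-go, state, action) triples
// ending at step t (hypothetical helper).
function buildContext(steps, t, contextLength = 20) {
  const rtg = returnsToGo(steps.map((s) => s.reward));
  const start = Math.max(0, t - contextLength + 1);
  return steps.slice(start, t + 1).map((s, i) => ({
    returnToGo: rtg[start + i],
    state: s.state,
    action: s.action,
  }));
}

const loggedEpisode = [
  { state: [0.1], action: 0, reward: 1 },
  { state: [0.2], action: 1, reward: 0 },
  { state: [0.3], action: 2, reward: 2 },
];
console.log(returnsToGo(loggedEpisode.map((s) => s.reward))); // [ 3, 2, 2 ]
```
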
2. Q-Learning
Type: Value-Based RL (Off-Policy)
Best For: Discrete action spaces, sample efficiency
Strengths: Proven, simple, works well for small/medium problems

npx agentdb@latest create-plugin -t q-learning -n q-agent

Use Cases:
- Grid worlds, board games
- Navigation tasks
- Resource allocation
- Discrete decision-making

Configuration:
{
  "algorithm": "q-learning",
  "learning_rate": 0.001,
  "gamma": 0.99,
  "epsilon": 0.1,
  "epsilon_decay": 0.995
}
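
To make the configuration fields concrete, here is a minimal tabular Q-learning sketch showing where `learning_rate`, `gamma`, `epsilon`, and `epsilon_decay` enter the textbook algorithm. It is not the plugin's actual code; the `Map`-based table and the function names are assumptions made for illustration.

```javascript
// Textbook tabular Q-learning, illustrative only (not the AgentDB plugin).
const qConfig = { learning_rate: 0.001, gamma: 0.99, epsilon: 0.1, epsilon_decay: 0.995 };

// Q-table as Map<stateKey, number[]>; unseen states start at zero.
function qValues(table, state, nActions) {
  if (!table.has(state)) table.set(state, new Array(nActions).fill(0));
  return table.get(state);
}

// Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
function chooseAction(table, state, nActions, epsilon, rand = Math.random) {
  if (rand() < epsilon) return Math.floor(rand() * nActions);
  const q = qValues(table, state, nActions);
  return q.indexOf(Math.max(...q));
}

// Off-policy update: the target bootstraps from the best next action.
function qLearningUpdate(table, { state, action, reward, nextState, done }, nActions, cfg) {
  const q = qValues(table, state, nActions);
  const nextMax = done ? 0 : Math.max(...qValues(table, nextState, nActions));
  const target = reward + cfg.gamma * nextMax;
  q[action] += cfg.learning_rate * (target - q[action]);
  return q[action];
}

const qTable = new Map();
qLearningUpdate(
  qTable,
  { state: 's0', action: 1, reward: 1, nextState: 's1', done: false },
  3,
  qConfig
);
qConfig.epsilon *= qConfig.epsilon_decay; // decay exploration, e.g. once per episode
```

With a learning rate as small as 0.001, each update moves the estimate only slightly toward the target, so many passes over the data are needed before values settle.
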
3. SARSA
Type: Value-Based RL (On-Policy)
Best For: Safe exploration, risk-sensitive tasks
Strengths: More conservative than Q-Learning, better for safety

npx agentdb@latest create-plugin -t sarsa -n sarsa-agent

Use Cases:
- Safety-critical applications
- Risk-sensitive decision-making
- Online learning with exploration

Configuration:
{
  "algorithm": "sarsa",
  "learning_rate": 0.001,
  "gamma": 0.99,
  "epsilon": 0.1
}
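
The on-policy character of SARSA shows up in a single line of the update: the target uses the value of the action the policy actually takes next, not the best available one. The sketch below is the textbook rule with a plain object as Q-table; `sarsaUpdate` and the table layout are assumptions for illustration, not the plugin's code.

```javascript
// Textbook SARSA update, illustrative only (not the AgentDB plugin).
const sarsaConfig = { learning_rate: 0.001, gamma: 0.99 };

// q is a plain object: q[state] is an array of action values.
function sarsaUpdate(q, { state, action, reward, nextState, nextAction, done }, cfg) {
  const current = q[state][action];
  // On-policy: bootstrap from the action that will actually be taken next.
  const nextQ = done ? 0 : q[nextState][nextAction];
  q[state][action] = current + cfg.learning_rate * (reward + cfg.gamma * nextQ - current);
  return q[state][action];
}

const sarsaTable = { s0: [0, 0], s1: [0.5, 2.0] };
// The policy explored and picked action 0 in s1, so the target is
// 1 + 0.99 * 0.5 = 1.495. Q-learning would have used max(0.5, 2.0) = 2.0.
sarsaUpdate(
  sarsaTable,
  { state: 's0', action: 0, reward: 1, nextState: 's1', nextAction: 0, done: false },
  sarsaConfig
);
```

Because exploratory (possibly poor) next actions feed into the target, the learned values account for the cost of exploration, which is why SARSA tends to behave more conservatively than Q-learning.
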
4. Actor-Critic
Type: Policy Gradient with Value Baseline
Best For: Continuous actions, variance reduction
Strengths: Stable, works for continuous/discrete actions

npx agentdb@latest create-plugin -t actor-critic -n ac-agent

Use Cases:
- Continuous control (robotics, simulations)
- Complex action spaces
- Multi-agent coordination

Configuration:
{
  "algorithm": "actor-critic",
  "actor_lr": 0.001,
  "critic_lr": 0.002,
  "gamma": 0.99,
  "entropy_coef": 0.01
}
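
As a rough illustration of how `gamma` and `entropy_coef` are typically used, the sketch below computes the one-step advantage and the actor and critic losses for a single transition with a discrete softmax policy. This is the standard one-step actor-critic formulation, not code taken from the plugin; the function names and input shape are assumptions.

```javascript
// One-step actor-critic losses for a single transition (illustrative only).
const acConfig = { gamma: 0.99, entropy_coef: 0.01 };

// Numerically stable softmax over action logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((z) => Math.exp(z - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

function actorCriticLosses({ logits, action, reward, value, nextValue, done }, cfg) {
  const probs = softmax(logits);
  // The critic's TD error doubles as the advantage estimate (the "baseline").
  const tdTarget = reward + (done ? 0 : cfg.gamma * nextValue);
  const advantage = tdTarget - value;
  // Policy entropy; subtracting it from the loss rewards exploration.
  const entropy = -probs.reduce((h, p) => h + (p > 0 ? p * Math.log(p) : 0), 0);
  const actorLoss = -Math.log(probs[action]) * advantage - cfg.entropy_coef * entropy;
  const criticLoss = advantage ** 2;
  return { advantage, entropy, actorLoss, criticLoss };
}

const acLosses = actorCriticLosses(
  { logits: [0, 0], action: 0, reward: 1, value: 0.5, nextValue: 1, done: false },
  acConfig
);
console.log(acLosses.advantage.toFixed(2)); // 1.49
```

In a full trainer the actor parameters would be stepped with `actor_lr` on `actorLoss` and the critic with `critic_lr` on `criticLoss`; gradient computation is omitted here.
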
5. Active Learning
Type: Query-Based Learning
Best For: Label-efficient learning, human-in-the-loop
Strengths: Minimizes labeling cost, focuses on uncertain samples

Use Cases:
- Human feedback incorporation
- Label-efficient training
- Uncertainty sampling
- Annotation cost reduction
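
Uncertainty sampling, listed among the use cases, can be sketched in a few lines: score each unlabeled item by the entropy of the model's predicted class probabilities and send the most uncertain ones for labeling. The pool format and the names `predictiveEntropy` and `selectForLabeling` are hypothetical and not part of any AgentDB API.

```javascript
// Entropy-based uncertainty sampling (illustrative only).

// Shannon entropy of a predicted class distribution.
function predictiveEntropy(probs) {
  return -probs.reduce((h, p) => h + (p > 0 ? p * Math.log(p) : 0), 0);
}

// Pick the k items the model is least sure about.
function selectForLabeling(pool, k) {
  return pool
    .map((item) => ({ ...item, uncertainty: predictiveEntropy(item.probs) }))
    .sort((a, b) => b.uncertainty - a.uncertainty)
    .slice(0, k);
}

const unlabeledPool = [
  { id: 'a', probs: [0.98, 0.02] }, // confident prediction
  { id: 'b', probs: [0.55, 0.45] }, // close to a coin flip
  { id: 'c', probs: [0.7, 0.3] },
];
console.log(selectForLabeling(unlabeledPool, 2).map((x) => x.id)); // [ 'b', 'c' ]
```
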
6. Adversarial Training
Type: Robustness Enhancement
Best For: Safety, robustness to perturbations
Strengths: Improves model robustness, adversarial defense

Use Cases:
- Security applications
- Robust decision-making
- Adversarial defense
- Safety testing
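
The source does not say which adversarial method the plugin uses. As one common example of the idea, the sketch below applies the fast gradient sign method (FGSM) to a small logistic-regression model: the input is nudged by `epsilon` in the direction that increases the loss, and the perturbed input can then be added to the training data. The model, names, and values are assumptions for illustration.

```javascript
// FGSM on a logistic-regression model (illustrative; the plugin's actual
// method is not documented in the source).
function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

// For binary cross-entropy with p = sigmoid(w.x + b),
// the gradient of the loss with respect to the input is (p - y) * w.
function inputGradient(w, b, x, y) {
  const z = x.reduce((sum, xi, i) => sum + w[i] * xi, b);
  const p = sigmoid(z);
  return w.map((wi) => (p - y) * wi);
}

// Adversarial example: x + epsilon * sign(gradient).
function fgsm(w, b, x, y, epsilon) {
  const grad = inputGradient(w, b, x, y);
  return x.map((xi, i) => xi + epsilon * Math.sign(grad[i]));
}

const advExample = fgsm([1, -2], 0, [0.5, 0.5], 1, 0.1);
console.log(advExample.map((v) => v.toFixed(1))); // [ '0.4', '0.6' ]
```
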
7. Curriculum Learning
Type: Progressive Difficulty Training
Best For: Complex tasks, faster convergence
Strengths: Stable learning, faster convergence on hard tasks

Use Cases:
- Complex multi-stage tasks
- Hard exploration problems
- Skill composition
- Transfer learning
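
One simple way to realize progressive difficulty is to sort tasks by a difficulty score and widen the training set stage by stage. The generator below is a sketch under that assumption; the `difficulty` field, the stage count, and the function name are hypothetical and not taken from AgentDB.

```javascript
// Staged curriculum: each stage adds harder tasks (illustrative only).
// Assumes every task carries a numeric `difficulty` score.
function* curriculumStages(tasks, stages = 3) {
  const sorted = [...tasks].sort((a, b) => a.difficulty - b.difficulty);
  for (let s = 1; s <= stages; s++) {
    // Stage s trains on the easiest s/stages fraction of tasks.
    const cutoff = Math.ceil((sorted.length * s) / stages);
    yield sorted.slice(0, cutoff);
  }
}

const curriculumTasks = [
  { name: 'maze-large', difficulty: 0.9 },
  { name: 'corridor', difficulty: 0.1 },
  { name: 'maze-small', difficulty: 0.5 },
  { name: 'room', difficulty: 0.3 },
];

for (const stage of curriculumStages(curriculumTasks, 2)) {
  console.log(stage.map((t) => t.name));
}
// [ 'corridor', 'room' ]
// [ 'corridor', 'room', 'maze-small', 'maze-large' ]
```
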
8. Federated Learning
Type: Distributed Learning
Best For: Privacy, distributed data
Strengths: Privacy-preserving, scalable

Use Cases:
- Multi-agent systems
- Privacy-sensitive data
- Distributed training
- Collaborative learning
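
The privacy property comes from sharing model updates instead of raw data. A widely used aggregation rule is federated averaging, where each agent's weights are averaged in proportion to how many samples it trained on. The sketch below shows that rule; whether the plugin uses exactly this scheme is an assumption, and the update shape `{ weights, numSamples }` is hypothetical.

```javascript
// Federated averaging of flat weight vectors (illustrative only).
function federatedAverage(updates) {
  const total = updates.reduce((n, u) => n + u.numSamples, 0);
  const dim = updates[0].weights.length;
  const avg = new Array(dim).fill(0);
  for (const { weights, numSamples } of updates) {
    // Each agent contributes in proportion to its local sample count.
    for (let i = 0; i < dim; i++) avg[i] += (numSamples / total) * weights[i];
  }
  return avg;
}

const globalWeights = federatedAverage([
  { weights: [1, 2], numSamples: 1 }, // agent A: 1 local sample
  { weights: [3, 4], numSamples: 3 }, // agent B: 3 local samples
]);
console.log(globalWeights); // [ 2.5, 3.5 ]
```
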
9. Multi-Task Learning
Type: Transfer Learning
Best For: Related tasks, knowledge sharing
Strengths: Faster learning on new tasks, better generalization

Use Cases:
- Task families
- Transfer learning
- Domain adaptation
- Meta-learning

Training Workflow
1. Collect Experiences
// Store experiences during agent execution
for (let i = 0; i < numEpisodes; i++) {
  const episode = runEpisode();

  for (const step of episode.steps) {
    await adapter.insertPattern({
      id: '',
      type: 'experience',
      domain: 'task-domain',
      pattern_data: JSON.stringify({
        embedding: await computeEmbedding(JSON.stringify(step)),
        pattern: {
          state: step.state,
          action: step.action,
          reward: step.reward,
          next_state: step.next_state,
          done: step.done
        }
      }),
      confidence: step.reward > 0 ? 0.9 : 0.5,
      usage_count: 1,
      success_count: step.reward > 0 ? 1 : 0,
      created_at: Date.now(),
      last_used: Date.now(),
    });
  }
}

2. Train Model
// Train on collected experiences
const trainingMetrics = await adapter.train({
  epochs: 100,
  batchSize: 64,
  learningRate: 0.001,
  validationSplit: 0.2,
});

console.log('Training Metrics:', trainingMetrics);
// {
//   loss: 0.023,
//   valLoss: 0.028,
//   duration: 1523,
//   epochs: 100
// }

3. Evaluate Performance
// Retrieve similar successful experiences
const testQuery = await computeEmbedding(JSON.stringify(testState));
const result = await adapter.retrieveWithReasoning(testQuery, {
  domain: 'task-domain',
  k: 10,
  synthesizeContext: true,
});

// Evaluate action quality
const suggestedAction = result.memories[0].pattern.action;
const confidence = result.memories[0].similarity;

console.log('Suggested Action:', suggestedAction);
console.log('Confidence:', confidence);

Advanced Training Techniques
Experience Replay
// Store experiences in buffer
const replayBuffer = [];

// Sample random batch for training
// (sampleRandomBatch is a helper you supply; it is not defined in this document)
const batch = sampleRandomBatch(replayBuffer, 32);

// Train on batch
await adapter.train({
  data: batch,
  epochs: 1,
  batchSize: 32,
});

Prioritized Experience Replay
// Store experiences with priority (TD error)
await adapter.insertPattern({
  // ... standard fields
  confidence: tdError, // Use TD error as confidence/priority
  // ...
});

// Retrieve high-priority experiences
const highPriority = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'task-domain',
  k: 32,
  minConfidence: 0.7, // Only high TD-error experiences
});

Multi-Agent Training
// Collect experiences from multiple agents
for (const agent of agents) {
  const experience = await agent.step();

  await adapter.insertPattern({
    // ... store experience with agent ID
    domain: `multi-agent/${agent.id}`,
  });
}

// Train shared model
await adapter.train({
  epochs: 50,
  batchSize: 64,
});

Performance Optimization
Batch Training
// Collect batch of experiences
// (collectBatch is a helper you supply; it is not defined in this document)
const experiences = collectBatch(1000);

// Batch insert (500x faster)
for (const exp of experiences) {
  await adapter.insertPattern({ /* ... */ });
}

// Train on batch
await adapter.train({
  epochs: 10,
  batchSize: 128, // Larger batch for efficiency
});

Incremental Learning
// Train incrementally as new data arrives
setInterval(async () => {
  const newExperiences = getNewExperiences();

  if (newExperiences.length > 100) {
    await adapter.train({
      epochs: 5,
      batchSize: 32,
    });
  }
}, 60000); // Every minute

Integration with Reasoning Agents
Combine learning with reasoning for better performance:

// Train learning model
await adapter.train({ epochs: 50, batchSize: 32 });

// Use reasoning agents for inference
const result = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'decision-making',
  k: 10,
  useMMR: true, // Diverse experiences
  synthesizeContext: true, // Rich context
  optimizeMemory: true, // Consolidate patterns
});

// Make decision based on learned experiences + reasoning
const decision = result.context.suggestedAction;
const confidence = result.memories[0].similarity;

CLI Operations

Create plugin

npx agentdb@latest create-plugin -t decision-transformer -n my-plugin

List plugins

npx agentdb@latest list-plugins

Get plugin info

npx agentdb@latest plugin-info my-plugin

List templates

npx agentdb@latest list-templates

Troubleshooting
Issue: Training not converging
// Reduce learning rate
await adapter.train({
  epochs: 100,
  batchSize: 32,
  learningRate: 0.0001, // Lower learning rate
});

Issue: Overfitting
// Use validation split
await adapter.train({
  epochs: 50,
  batchSize: 64,
  validationSplit: 0.2, // 20% validation
});

// Enable memory optimization
await adapter.retrieveWithReasoning(queryEmbedding, {
  optimizeMemory: true, // Consolidate, reduce overfitting
});

Issue: Slow training

Enable quantization for faster inference

Use binary quantization (32x faster)
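
The source does not show how quantization is switched on, so no AgentDB option is given here. For intuition only, the sketch below shows what binary quantization of an embedding generally means: each float dimension is reduced to one sign bit, and vectors are compared by Hamming distance. Going from a 32-bit float to 1 bit per dimension is a 32x reduction in storage; whether the "32x faster" figure above refers to this ratio is an assumption. The function names are hypothetical.

```javascript
// Generic binary quantization of an embedding (illustrative only;
// not AgentDB's API).

// One bit per dimension (1 if the value is positive), packed into bytes.
function binaryQuantize(vec) {
  const bits = new Uint8Array(Math.ceil(vec.length / 8));
  vec.forEach((v, i) => {
    if (v > 0) bits[i >> 3] |= 1 << (i & 7);
  });
  return bits;
}

// Number of differing bits between two packed vectors of equal length.
function hammingDistance(a, b) {
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      dist += x & 1;
      x >>= 1;
    }
  }
  return dist;
}

const quantA = binaryQuantize([0.3, -0.1, 0.8, -0.5]); // bits 1,0,1,0
const quantB = binaryQuantize([0.2, 0.4, -0.9, -0.1]); // bits 1,1,0,0
console.log(hammingDistance(quantA, quantB)); // 2
```
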

Learn More
Algorithm Papers: See docs/algorithms/ for detailed papers
GitHub: https://github.com/ruvnet/agentic-flow/tree/main/packages/agentdb
MCP Integration: npx agentdb@latest mcp
Website: https://agentdb.ruv.io

Category: Machine Learning / Reinforcement Learning
Difficulty: Intermediate to Advanced
Estimated Time: 30-60 minutes